Deterministically annealed mixture of experts models for statistical regression
Authors
Abstract
A new and effective design method is presented for statistical regression functions that belong to the class of mixture models. The class includes the hierarchical mixture of experts (HME) and the normalized radial basis functions (NRBF). Design algorithms based on the maximum likelihood (ML) approach, which emphasize a probabilistic description of the model, have attracted much interest in HME and NRBF models. However, their design objective is mismatched to the original squared-error regression cost, and the algorithms are easily trapped by poor local minima on the cost surface. In this paper, we propose an extension of the deterministic annealing (DA) method for the design of mixture-based regression models. We construct a probabilistic framework but, unlike the ML method, we directly optimize the squared-error regression cost while avoiding poor local minima. Experimental results show that the DA method outperforms standard design methods for both HME and NRBF regression models.

This work was supported in part by the National Science Foundation under grant no. NCR-9314335, the University of California MICRO program, ACT Networks, Advanced Computer Communications, Stratacom, DSP Group, DSP Software Engineering, Fujitsu, General Electric Company, Hughes Electronics, Intel, Moseley Associates, National Semiconductor, Nokia Mobile Phones, Qualcomm, Rockwell International, and Texas Instruments. David Miller was supported by NSF Career Award NSF IRI-9624870.

1. MIXTURE OF EXPERTS REGRESSION

In recent years, there has been growing interest in learning methods for regression functions that can be statistically interpreted as mixture models or mixture of experts (ME) models. The ME regression function takes the form:

$$g(x) = \sum_j P[j|x]\, f(x; \Lambda_j), \tag{1}$$

where $P[j|x]$ is a non-negative weight of association between the input $x$ and the $j$th "local expert regression function" $f(x; \Lambda_j)$. Each local expert $f(x; \Lambda_j)$ is usually a constant, linear, or simple nonlinear function of $x$ and depends on the parameter set $\Lambda_j$. The weights of association can be naturally interpreted as a probability distribution since P ...
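The form of the ME regression function can be illustrated with a minimal Python sketch. It is not the authors' design algorithm: it only evaluates a weighted sum of local experts for fixed parameters. The softmax gate, the linear experts, the scalar input, and all names (`softmax_gate`, `linear_expert`, `me_regression`, and the example parameter values) are assumptions chosen for illustration.

```python
import math

def softmax_gate(x, gate_params):
    """Hypothetical gate: P[j|x] proportional to exp(a_j * x + b_j).

    Any rule producing non-negative weights that sum to one would do;
    softmax is only an assumption made for this sketch.
    """
    scores = [a * x + b for a, b in gate_params]
    m = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear_expert(x, params):
    """Local expert f(x; params_j) = w * x + c (linear case, assumed)."""
    w, c = params
    return w * x + c

def me_regression(x, gate_params, expert_params):
    """Evaluate g(x) = sum_j P[j|x] * f(x; params_j)."""
    weights = softmax_gate(x, gate_params)
    return sum(p * linear_expert(x, prm)
               for p, prm in zip(weights, expert_params))

# Two illustrative experts: one dominates for x < 0, the other for x > 0.
gate_params = [(-4.0, 0.0), (4.0, 0.0)]
expert_params = [(1.0, 0.0), (-1.0, 2.0)]

for x in (-5.0, 0.0, 5.0):
    print(x, me_regression(x, gate_params, expert_params))
```

At `x = 0` both experts receive weight 0.5, so the output is the average of the two expert predictions. Far from zero, one weight approaches 1 and the output follows a single expert.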
Similar resources
An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...
The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models
In previous studies on fitting non-linear regression models with the symmetric structure the normality is usually assumed in the analysis of data. This choice may be inappropriate when the distribution of residual terms is asymmetric. Recently, the family of scale-mixture of skew-normal distributions is the main concern of many researchers. This family includes several skewed and heavy-tailed d...
Comparison of Ensemble Approaches: Mixture of Experts and AdaBoost for a Regression Problem
Two machine learning approaches: mixture of experts and AdaBoost.R2 were adjusted to the real-world regression problem of predicting the prices of residential premises based on historical data of sales/purchase transactions. The computationally intensive experiments were conducted aimed to compare empirically the prediction accuracy of ensemble models generated by the methods. The analysis of t...
Mixture of experts regression modeling by deterministic annealing
We propose a new learning algorithm for regression modeling. The method is especially suitable for optimizing neural network structures that are amenable to a statistical description as mixture models. These include mixture of experts, hierarchical mixture of experts (HME), and normalized radial basis functions (NRBF). Unlike recent maximum likelihood (ML) approaches, we directly minimize the (...
Using Regression based Control Limits and Probability Mixture Models for Monitoring Customer Behavior
In order to achieve the maximum flexibility in adaptation to ever changing customer’s expectations in customer relationship management, appropriate measures of customer behavior should be continually monitored. To this end, control charts adjusted for buyer’s/visitor’s prior intention to repurchase or visit again are suitable means taking into account the heterogeneity across customers. In the ...
Journal title:
Volume, issue:
Pages: -
Publication date: 1997